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Abstract 

'Causal' direction is of great importance when dealing with complex systems. Often big volumes of data in the form of time 
series are available and it is important to develop methods that can inform about possible causal connections between the 
different observables. Here we investigate the ability of the Transfer Entropy measure to identify causal relations embedded 
in emergent coherent correlations. We do this by firstly applying Transfer Entropy to an amended Ising model. In addition 
we use a simple Random Transition model to test the reliability of Transfer Entropy as a measure of 'causal' direction in the 
presence of stochastic fluctuations. In particular we systematically study the effect of the finite size of data sets. 
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Introduction 

Many complex systems are able to self-organise into a critical 
state [1,2]. The local properties of the system will typically 
fluctuate in time and space but the way the fluctuations are 
interrelated or correlated may differ. In this context a critical state 
is defined in terms of the way in which the correlations of the local 
fluctuations decay in space and time. When a system isn't critical, 
the correlations of the fluctuations of a quantity A measured in two 
different positions at two different times, say A(Yo,to) and 
A{ro-\-r,to-\-t) decay as an exponential function of the separation 
in space |r| and also decay exponentially as function of the 
separation in time t. However in a critical state the correlations 
exhibit a much slower algebraic decay, i.e. the correlation 
functions decay as negative powers of |r| and t. This is the 
behaviour observed at second order phase transitions in thermal 
equilibrium, which are denoted the critical points. The slow 
algebraic decay of correlations is equivalent to correlations 
effectively spanning across the entire system. Or in other words, 
in the critical state local distortions can propagate throughout the 
entire system [2-4]. We address here how to identify directed 
stochastic causal connections embedded in a background of 
strongly correlated stochastic fluctuations. 

Most of 'causality' and directionality measures have been tested 
on low dimension systems and neglect addressing the behaviour of 
systems consisting of large numbers of interdependent degrees of 
freedom that is a main feature of complex systems. From a 
complex systems point of view, on one hand there is the system as 
a whole (collective behaviour) and on another there are individual 
interactions that lead to the collective behaviour. A measure that 
can help understand and differentiate these two elements is 
needed. We shall first seek to make a clear definition of 'causality' 
and then relate this definition to complex systems. We outline the 



different approaches and measures used to quantify this type of 
'causality'. We highlight that for multiple reasons. Transfer 
Entropy seems to be a very suitable candidate for a 'causality' 
measure for complex systems. Consequently we seek to shed some 
light on the usage of Transfer Entropy on complex systems. 

To improve our understanding of Transfer Entropy we study 
two simplistic models of complex systems which in a very 
controllable way generate correlated time series. Complex system 
whose main characteristic consist in essential cooperative behav- 
iour [5] takes into account instances when the whole system is 
interdependent. Therefore, we apply Transfer Entropy to the 
(amended) Ising model in order to investigate its behaviour at 
different temperatures particularly near the critical temperature. 
Moreover, we are also interested in investigating the different 
magnitude of Transfer Entropy in general (which is not fully 
understood [6]) by looking at the effect of different transition 
probabilities, or activity levels. We discuss the interpretation of the 
different magnitudes of the Transfer Entropy by varying transition 
rates in a Random Transition model. 

Quantifying 'Causality' 

The quantification of 'causality' was first envisioned by the 
mathematician Wiener [7] who propounded the idea that the 
'causality' of a variable in relation to another can be measured by 
how well the variable helps to predict the other. In other words, 
variable Y 'causes' variable X if the ability to predict X is 
improved by incorporating information about Y in the prediction 
of X. The conceptualisation of 'causality' as envisioned by Wiener 
was formulated by Granger [8] leading to the establishment of the 
Wiener-Granger framework of 'causality'. This is the definition of 
'causality' that we shall adopt in this paper. 
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In literature, references to 'causality' take many guises. The 
term directionality, information transfer and sometimes even 
independence can possibly refer to some sort of 'causality' in line 
with the Wiener-Granger framework. Continuing the assumption 
that Y causes X, one would expect the relationship between X 
and Y to be asymmetric and that the information flows in a 
direction from the source Y to the target X. One can assume that 
this information transfer is the unique information provided by the 
causal variable to the affected one. When one variable causes 
another variable, the affected variable (the target) will be 
dependent (to certain extent) on the causal variable (the source). 
There must exist a certain time lag however small between the 
source and the target [9-1 1], this will be henceforth referred to as 
the causal lag [8]. One could also say the Wiener-Granger 
framework of prediction based 'causality' is equivalent to looking 
for dependencies between the variables at a certain causal lag. 

Roughly, there are two different approaches in establishing 
'causality' in a system. One approach is to make a qualified guess 
of a model that will fit the data, called the confirmatory approach 
[12]. Models of this nature are typically very field specific and rely 
on particular insights into the mechanism involved. A contrasting 
approach known as the exploratory approach, infers 'causal' 
direction from the data. This approach does not rely on any 
preconceived idea about underlying mechanisms and let results 
from data shape the directed model of the system. Most of the 
measures within the Wiener-Granger framework falls into this 
category. One can think of the different approaches as being on a 
spectrum from purely confirmatory to purely exploratory. 

The nature of complex systems calls for the exploratory 
approach. The abundance of data emphasises this even more so. 
In fact 'causality' measures in the Wiener Granger framework 
have been increasingly utilised on data sets obtained from complex 
systems such as the brain [13,14] and financial systems [15]. 
Unfortunately, most of the basic testings of the effectiveness of 
these measures are mostly done on dynamical systems [16-18] or 
simple time series, without taking into account the emergence of 
collective behaviour and criticality. Complex systems are typically 
stochastic and thus different from deterministic systems where the 
internal and external influences are distinctly identified. As 
mentioned above, here we focus on the emergence of collective 
behaviour in complex systems and in particular on how the 
intermingling of the collective behaviour with individual (coupled) 
interactions complicates the identification of 'causal' relationships. 
Identifying a measure that is able to distinguish between these 
different interactions will obviously help us to improve our 
understanding of the dynamics of complex systems. 

Transfer Entropy 

Within the Wiener-Granger framework, two of the most 
popular 'causality' measure are Granger Causality (G-causality) 
and its nonlinear analog Transfer Entropy. G-causality and 
Transfer Entropy are exploratory as their measures of causality 
are based on distribution of the sampled data. The standard steps 
of prediction based 'causality' that underlies these measures can be 
summarized as follows. Say we want to test whether variable Y 
causes variable X. The first step would be to predict the current 
value of X using the historical values of X. The second step is to 
do another prediction where the historical values of Y and X are 
both used to predict the current value of X. And the last step 
would be to compare the former to the latter. If the second 
prediction is judged to be better than the first one, then one can 
conclude that Y causes X. This being the main idea, we outline 
why Transfer Entropy is more suitable for complex systems. 



Granger causality is the most commonly used 'causality' 
indicator [9] . However, in the context of the nonlinearities of a 
complex systems (collective behaviour and criticality being the 
main example), using G-causality may not be sufficient. Moreover, 
the inherently linear autoregressive framework makes G-causality 
less exploratory than Transfer Entropy. Transfer Entropy was 
defined [16,17] as a nonlinear measure to infer directionality using 
the Markov property. The aim was to incorporate the properties of 
Mutual Information and the dynamics captured by transition 
probabilities in order to understand the concept and exchange of 
information. More recently, the usage of Transfer Entropy to 
detect causal relationships [19-21] and causal lags (the time 
between cause and effect) has been further examined [6,22]. Thus 
we are especially interested in Transfer Entropy due to its 
propounded ability to capture nonlinearities, its exploratory nature 
as well as its information theoretic background that provides 
information transfer related interpretation. Unfortunately, some of 
the vagueness in terms of interpretation may cause confusion in 
complex systems. The rest of the paper is an attempt to discuss 
these issues in a reasonably self-contained manner. 

Mutual Information based measures 

Define random variables X, Y and Z with discrete probability 
distributions px(x),XEX, pY(y),yGy and Pz(z),zeZ. The entropy 
of X is defined [23,24] as 

H(X)=-Y,Px(x)\ogpx{x) (1) 

where log to the base e and 0 log 0 = 0 is used. The joint entropy 
of X and Y is defined as 

H(X, Y)=-Y^ Y.PxY(x,y) \ogpxYix,y) (2) 

and the conditional entropy can be written as 

H{X\ Y)=-Y^ Y.PxYix,y)\ogpx\Y(x\y) (3) 
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Figure 1. Susceptibility x on the Ising model with lengths 
Z.=1 0,25,50,1 00 obtained using equation (9). Peaks can be seen at 
respective Tc. 

doi:10.1371/journaLpone.0099462.g001 



PLOS ONE I www.plosone.org 



2 



June 2014 | Volume 9 | Issue 6 | e99462 



Quantifying 'Causality' in Complex Systems 



0.8 




2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 



Temperature 

Figure 2. Covariance r(A,G) on the Ising model with lengths 
Z.=1 0,25,50,1 00 obtained using equation (10). 

doi:10.1371/joumal.pone.0099462.g002 

where pxY is the joint distribution and px\Y is the respective 
conditional distribution. The Mutual Information [24,25] is 
defined as 

I(X, Y) = H{X) - H{X\ Y). (4) 

Taking into account conditional variables, the conditional 
Mutual Information [19,24] is defined as I(X,Y\Z) = 
H(X\Z) — H(X\Y,Z). A variant of conditional Mutual Informa- 
tion namely the Transfer Entropy was first defined by Schreiber in 
[16]. Let X^ be the variable X that is shifted by t, so that the 
values of = — t) where X(n) is the value of X at time 

step n and similarly for Y. We highlight a simple form of Transfer 
Entropy where conditioning is minimal such that 

r^l = i(x, Y'\x^ ) = H(x\x^ ) - H(x\x^ , r ). (5) 
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Figure 4. Transfer Entropy J^^^^ and J^^^ on the Ising model of 
lengths Z.=50 obtained using equation (5). Peaks for both direction 
are at Tc. 

doi:10.1371/journal.pone.0099462.g004 

The idea is that, if Y causes X at causal lag ty, then TyP > Tyx 
for any lag T since H(X\X\Y'y) <H(X\X\Y') due to the fact 
that Y^Y should provide the most information about the change of 
X^ to X. This simple form allows us to vary the values of time lag 
T in ascertaining the actual causal lag. This form of Transfer 
Entropy was also used in [13,18,22,26,27]. The Transfer Entropy 
in equation (5) can also be written as 

^YX =2^2^ l^Pxx^ '3^)log ' .... . (6) 

xeX x'eX yey ^ X\X^ ^ ' ^ 

Our choice of this simple definition was motivated by the fact that 
it directly captures how the state of Y^{n) — Y(n — t) influences the 
changes in X i.e. from X(n) to X^(n) = X(n— 1). In other words, 
equation (5) is tailor made to measure whether the state of 
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Figure 3. Mutual Information I(A,G) on the Ising model with 
lengths Z.=1 0,25,50,1 00 obtained using equation (4). 

doi:1 0.1 371 /journal. pone.0099462.g003 
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Figure 5. Transfer Entropy T^^^ on the Ising model of lengths 
Z.=1 0,25,50,1 00 obtained using equation (5). Peaks can be seen at 
respective Tc. 

doi:10.1371/journal.pone.0099462.g005 
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Figure 6. Transfer Entropy T^^^ on the Ising model of lengths 
Z.=1 0,25,50,1 00 obtained using equation (5). Peaks can be seen at 
respective Tc. 

doi:10.1371/joumal.pone.0099462.g006 



Y{n — t) influences the current changes in X. This coincides with 
the predictive view of 'causality' in the Wiener-Granger fi:'ame- 
work where the current state of one variable (the source) influences 
the changes in another variable (the target) in the future. The same 
concept will be applied in order to probe this kind of 'causality' in 
our models. 

The Ising Model 

A system is critical when correlations are long ranged. A simple 
prototype example is the Ising model [2] at critical temperature, 
Tc. Away from Tc correlations are short ranged and dies off 
exponentially with separation. We shall apply Transfer Entropy to 
the Ising model in order to investigate its behaviour at different 
temperatures particularly in the vicinity of the critical temperature. 
One can visualize the 2D Ising model as a two dimensional square 
lattice with length L composed of 7V = L^ sites Si^ieM ={\ • • • N}. 
These sites can only be in two possible states, spin-up {Si = 1) or 
spin-down {Si= — 1). We restrict the interaction of the sites to only 
its nearest neighbours (in two dimensions this will be sites to the 
north, south, east and west). Let the interaction strength between / 
and j be denoted by 



/// 



/ > 0, if / and j are nearest neighbours and ijEj\f 



0, 



otherwise 



so that the Hamiltonian (energy), 7i, is given by [2,28] 

ieAf jeAf 



(7) 



(8) 



7i is used to obtain the 



exp(-jgH) 



with P = 



Boltzmann (Gibbs) distribution 
where Kb is the Boltzmann 



J2 QW(-P^) " KbT 
constant and T is temperature. 

We implement the usual Metropolis Monte Carlo (MMC) 
algorithm [2,29,30] for the simulation of the Ising model in two 
dimensions with periodic boundary conditions. The algorithm 
proposed by Metropolis and co-workers in 1953 was designed to 
sample the Boltzmann distribution by artificially imposing 
dynamics on the Ising model. The implementation of the MMC 
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Figure 7. Susceptibility x on the amended Ising model of 
lengths Z.=1 0,25,50,1 00 obtained using equation (9). Peaks can 
be seen at respective Tc. 
doi:10.1371/journal.pone.0099462.g007 

algorithm in this paper is outlined as follows. A site is chosen at 
random to be considered for flipping (change of state) with 
probability y^. The event of considering the change and 
afterwards the actual change (if accepted) of the configuration, 
shall henceforth be referred to as flipping consideration. A sample 
is taken after each N flipping considerations. The logic being that, 
since sites to be considered are chosen randomly one at a time, 
after TV flips, each site will on average have been selected for 
consideration once. The interaction strength is set to be / = 1 and 
the Boltzmann constant is fixed as Kb = 1 for all the simulations. 
We let the system run up to 2000 samples before sampling at every 
N = L^ time steps. 

Through the MMC algorithm, a Markov chain (process) is 
formed for every site on the lattice. The state of each site at each 
sample will be taken as a time step n in the Markov chain {sx)n- 
Let S be the number of samples (length of the Markov chains). To 
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Figure 8. Covariance r(A,G) on the amended Ising model of 
lengths Z.=1 0,25,50,1 00 obtained using equation (10). Peaks can 
be seen at respective Tc, similar to Figure (2) of the Ising model. 
doi:10.1371/journal.pone.0099462.g008 
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Figure 9. Mutual Information I(A,G) on the amended Ising 
model with lengths Z.=1 0,25,50,1 00 obtained using equation 
(4). Not much different from results on the Ising model in Figure 3. 
doi:1 0.1 371 /journal. pone.0099462.g009 

get the probability values for each site, we utilise temporal average. 
All the numerical probabilities obtained for the Ising model in this 
paper have been obtained by averaging over simulations with 
S= 100000 unless stated otherwise. 

Measures on Ising model 

In an infinite two dimensional lattice, the phase transition of the 
Ising model with / = 1 and Kb = 1 is known to occur at the critical 
2 

temperature Tc= ^ 2.269 185 [2]. In a finite system, 

7og(l+V2) 

due to finite size effects, the critical values will not be quite as 
exact, we will call the temperature where the transition effectively 
occurs in the simulation as the crossover temperature T^. 
Susceptibility x is an observable that is normally used to identify 
Tc for the Ising model as seen in Figure (1). In order to define let 
^(n) = I (Si)^ be the sum of spins on a lattice of size N at time 
steps n=\, • • • ,S. The susceptibility [2] is given by 

X=^(E[m(nf]-E[m(nf) (9) 

where E[.] is the expectation in terms of temporal average and T is 
temperature. The covariance on the Ising model can be defined as 

r(X,Y) = r(sx,SY) = E[sxSY]-E[sx]E[sY] (10) 

where X, YeN. 

To display measures applied on individual sites, let sites 
^,5, GgA/" represent coordinates [1,1], [2,2] and [3,3] respectively. 
The values of the covariance T{A,G) and I{A,G) = I{sa,Sg) is 
displayed in Figure (2) and Figure (3). It can be seen that for the 
Ising model. Mutual Information gives no more information than 
covariance. From this figure, one can see that the values are system 
size dependent up to system size L = 50 or TV = 2500. We conclude 
from this, that up to this length scale, correlations are detectable 
across the entire lattice [2] . Thus we shall frequently utilize L = 50 
when illustration is required. 

Using time shifted variables we obtained the Transfer Entropy 
T^YX ~ '^^%x Figures (4-6). By looking at Figure (4) and then 
contrasting Figures (5) and (6), one can see that there is no clear 
difference between T^q^ and T^J]j in the figures thus no direction of 



Figure 10. Transfer Entropy 7^^^ ^ga the amended 

Ising model of lengths L = 50 and ^c; = 10, obtained using 
equation (5). Direction G^A at time lag 10 is indicated. Very 
different from result on Ising model in Figure 4. 
doi:10.1371/journal.pone.0099462.g010 

'causality' can be established between A and G. This is expected 
due to the symmetry of the lattice. More interestingly, the fact that 
Transfer Entropy peaks near Tc can be due to the fact that at Tc 
the correlations span across the entire lattice. Therefore, one may 
say that the critical transition and collective behaviour in the Ising 
model is detected by Transfer Entropy as a type of 'causality' that 
is symmetric in both directions. It is logical to interpret collective 
behaviour as a type of 'causality' in all directions since information 
is disseminated throughout the whole lattice when it is fully 
connected. This is an important fact to take into account when 
estimating Transfer Entropy on complex systems. 

Amended Ising Model 

In the amended Ising model we introduce an explicit directed 
dependence between the sites A, B and G in order to study how 
well Transfer Entropy is able to detect this causality. We will 
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Figure 11. Transfer Entropy T\jJ on the Ising model of lengths 
Z.=1 0,25,50,1 00 obtained using equation (5). Values continue to 
increase after Tc which is very different from Figure (5). 
doi:10.1371/journal.pone.0099462.g011 
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Figure 12. Transfer Entropy T^^^ on the Ising model of lengths 
Z.=1 0,25,50,1 00 obtained using equation (5). Peaks can be seen at 
respective Tc, similar to Ising model results in Figure (6). 
doi:1 0.1 371 /journal. pone.0099462.g01 2 

define the amended Ising model using the algorithm outlined as 
follows. At each step in the algorithm a site chosen at random will 
be considered for flipping with a certain probability except 
when ^ or 5 is selected where an extra condition needs to be 
fulfilled first before it can be allowed to change (flip). If 
(sG)n _ = 1 , ^ (or B) can be considered for flipping with 
probability as usual, however if (^G)«-?fj= —1, no change is 
allowed. Thus only one state of G {sq = 1 in this case) allows sites A 
and B to be considered for flipping. Therefore, although A and B 
have their own dynamics, their changes still depend on G. 

We simulated the amended Ising model with ^^=10 for 
different lattice lengths L. Figures (7) display the values of 
susceptibility x on the model and the peaks clearly show the 
presence of Tc in our model just like Figure (1) of the Ising model. 
Figures (8) and (9) display the values of the covariance T(A,G) and 
the Mutual Information I(A,G) respectively. We reiterate that our 
correlations reach across the system for L<50 [2,31]. While 
covariance and Mutual Information gives similar results to those of 
the standard Ising model as in Figures (2) and (3), a difference is 
clearly seen in Transfer Entropy values. Figure (10-12) displays 
the contrasts of T^q and on the amended Ising model which 
explicitly indicates the direction of 'causality' G^A. While 
Figure (12) is not very different from Figure (6), Figures (10) and 
(11) are clearly different from their counterparts in the Ising 
model. Figures (4) and (5). Transfer Entropy captures the effect of 
the amendment. 

Furthermore with this amendment, one can utilize Transfer 
Entropy to illustrate the effect of separation in time. The effect of 
deviation from the predetermined causal lag = 1 0, can be 
clearly seen in Figure (13), where the values of Tq^,t 10 reduces 
to 0 but at different rates depending on the deviation of t from t^. 
The further away from tQ, the faster the decrease to 0. Figure (14) 
is simply Figure (13) plotted over different time lags t to illustrate 
how Transfer Entropy correctly and distinctly identified causal lag 
^^ = 10. 

That temperature is a main factor in influencing the strength of 
Transfer Entropy values is apparent in all the figures in this 
section. One can observe, especially in Figure (13), that the 
Transfer Entropy values approaches 0 as they get further away 
from Tc except when the time lag t matches the delay induced 
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Figure 13. J^^ versus T for different time lags i in amended 
Ising model with to 10 and L = 50 using equation (5). The figure 
shows the effect of separation in time. 
doi:1 0.1 371/journal.pone.0099462.g01 3 

(t = to), in which case the Transfer Entropy value stabilizes to a 
certain fixed value as seen in Figure (15). In the vicinity of Tc, the 
lattice is highly correlated thus subsequently leading to higher 
values of Transfer Entropy. The increase and value stabilization 
after Tc is due to the fact that, as temperature increases, the 
probability for all 'flipping considerations' approaches a uniform 
distribution. This leads to transfer of information between site G 
and sites A and B occurring much more frequently at elevated 
temperature. 

Figure (16) and (17) display Transfer Entropy values for the 
Ising model and amended Ising model with tG = \ respectively. 
The figures illustrate the mechanism in which Transfer Entropy 
detects the predefined causal delay. Consider the following 
question: which site 'causes' site A? Firstly we see that T^^^ is 
zero in both figures due to the definition in equation (5). Note that 
by our definition this is only for t = 1 , if T 7^ 1 the Transfer Entropy 
value will be nonzero and also peak at Tc. More importantly we 




Time Lags 

Figure 14. A different view of Figure (1 3) where J^^ versus x for 
different temperatures T is plotted instead. Tc^23. Figure 
highlights time lag detection. 
doi:10.1371/journal.pone.0099462.g014 
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see that is difFerent from T^^^- In Figure (16) of the Ising 
model, the difference is due to separation (distance) in space and 
nearest neighbour interaction in the model, thus T^q^ < since 
G is further away from A than B. But in Figure (17) of the 
amended Ising model, the opposite is true and separation in space 
does not dominate the Transfer Entropy value in this interaction. 
The figure very clearly indicates that G 'causes' A at t = 1 and B 
does not. In other words, in the amended Ising model Transfer 
Entropy identifies G as a source in which one of the target is A, 
whereas in the Ising model the expected nearest neighbour 
dynamics presides. This result is only obtained for measures 



sensitive to transition probabilities. Measures that depend only on 
static probabilities such as covariance. Mutual Information and 
conditional Mutual Information will only give values in accor- 
dance to the underlying nearest neighbour dynamics in both the 
Ising model and the amended Ising model [32]. 

Transfer Entropy, directionality and change 

In order to understand the dynamics of of each site we calculate 
the effective rate of change (ERC) in relation to the transition 
probabilities. Let ERCx = P(^n^ ^n-\) ^or any site X on the 
lattice. Figure (18) illustrates how ERC a and ERCb are equal, as 
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Figure 16. T% T^J and J^^ in the Ising model with L = 50. 

T^jjl > Tg^ due to distance (separation) in space wliere B is closer to A 
than G. The nearest neighbour effect is observed. 
doi:1 0.1 371/journal.pone.0099462.g01 6 



Figure 17. 7^^^, 7^^ and J^^ in the amended Ising model with 

L = 50 and ^g = 1. T^ba<^ga ^l^e to implanted 'causal' lag. The effect 
of separation in space is no longer visible. 
doi:1 0.1 371/journal.pone.0099462.g01 7 
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Temperature 

Figure 1 8. ERC (Expected rate of change) of sites A, B and G on 
amended Ising model with ^c7 = 10 and L = 50. 

doi:10.1371/joumal.pone.0099462.g018 




Temperature 



Figure 19. °1.„^^° on amended Ising model with *e = 10 and 

L = 50 displaying phase-transition like behaviour. 

doi:1 0.1 371/joumal.pone.0099462.g01 9 



expected, and significantly different from ERCq. In Figure (10), 
the corresponding Transfer Entropy in both directions are 
displayed. At higher temperatures, it can be clearly seen that 
T^Q^ is larger than T^^q • However for temperatures near Tc it is 
not as clear and therefore to highlight the relative values we 

calculate — in Figure (19) and Figure (20) where 

ERC A 



rite) 



-GA_^_AG_^Q eRCa=^. We see that this value actually 
ERC A 

gives a clear jump at Tc and remains more or less a constant after 
Tc. Therefore even though Transfer Entropy in neither direction is 
zero, a clear indication of directionality can be obtained. 
Interestingly, the division with ERC brought out the clear phase 
transition-like behaviour that seems to distinguish the situation 
below and above Tc. Referring back to Figure (4) of the 



unamended Ising model we can clearly see that 



A^g) 
^ AG , 



ERCa 



^0 



for any direction in the unamended Ising model. We have 



AtG) 



AtG) 



demonstrated that - 



- is able to cancel out the symmetric 



ERCa 

contribution from the collective behaviour and only captures the 
imposed directed interdependence. 

In his introductory paper [16], Schreiber warns that in certain 
situations due to different information content as well as different 
information rates, the difference in magnitude should not be relied 
on to imply directionality unless Transfer Entropy in one direction 
is 0. We have shown that when collective behaviour is present on 
the Ising model, the value of Transfer Entropy cannot possibly be 
0. We suggest that this is due to fact that collective behaviour is as 
a type of 'causality' (disseminating information in all directions) 
and thus the Transfer Entropy is correctly indicating 'cause' in all 
directions. The clear difference in Transfer Entropy magnitude 
(even at Tc) observed when the model is amended indicates that 
the difference in Transfer Entropy can indeed serve as an indicator 
of directionality in systems with emergent cooperative behaviour. 
We have seen that Transfer Entropy is influenced by the nearest 
neighbour interactions, collective behaviour and the ERC. In the 
next section we use the Random Transition model to further 
investigate how the ERC influences the Transfer Entropy. 



Random Transition Model 

In the amended Ising model we implemented a causal lag as a 
restriction of one variable on another, in a way that a value of the 
source variable will affect the possible changes of the target 
variable. It is this novel concept of implementing 'causality' that 
we will analyze and expand in the Random Transition model. Let 
fix, fJ^Y /iz5 be the independent probabilities for the stochastic 
swaps of the variables X, Y and Z at every time step respectively. 
In addition to that, a restriction is placed on X and Y such that 
they are only allowed to do the stochastic swaps with probability 
fix and fly if the state of Z„_^^ fulfills a certain condition. This 
restriction means that X and Y can only change states if Z is in 
the conditioned state at time step n — tz thus creating a 
'dependence' on Z, analogous to the dependence of A and B on 
G in the amended Ising model. 

However in this model we allow the number of states Ug to be 
more than just two. The purpose of this is twofold, on one hand it 






—L=25 




—i—L=50 




L=100 



2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 
Temperature 

j(^g) _ j(fG) 

Figure 20. on amended Ising model with ^^ = 10 and 

ERCa 

L = 25,50,100. All with phase-transition like jump. 
doi:10.1371/joumal.pone.0099462.g020 
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contributes towards verifying that the behaviours of Transfer 
Entropy observed on the amended Ising model does extend to 
cases where Us > 2. On the other hand, the model also serves to 
highlight different properties of Transfer Entropy as well as the 
very crucial issue of probability estimation that may lead to 
misleading results. The processes are initialized randomly and 
independently. The swapping probabilities are taken to be 

i^x = Mr = Mz= — 5 ^hus enabling Transfer Entropy values to be 

calculated analytically. The transition probability of the Random 
Transition model is as follows. We assume that if a process chooses 
to change it must choose one of the other states equally, thus we 

have that P(X2 = oc\Xi= P^oc^ l])= —^P(X2^Xi), so that the 

ris-l 

marginal and joint probabilities remain uniform but the transition 
probabilities are 



P(X, = a|X„_i=^) = 



P(Yn = OC\Yn-l=P)-- 



1 — fix^ if a = 



r 



1 — /XyQ if a = jS 



and 



P(Z„ = a|Z„_i=^) = 



l—fiz if 01 = P 
fiz if oc^P 



where Q = P(condition fulfilled) such that one can control 
'dependence' on Z by altering Q. 



The relationship between Q and Q 

To understand how the values of /i^ affects the value of T^^ 
need a different variable. Let Q be the probability that the 
condition is fulfilled given current knowledge at time T such that 
g^^^^^^ = P(condition fulfilled | knowledge at time t). The value 
of 6lgi(y) ^ill depend on y, and in our model here, particularly on 
whether or not Z„_^^=y satisfies the condition. One can divide 
the possible states y of all the processes into two sets such that 



Gu = {yeA^Zn -t^=yf ulf ills the condition} and 

Gd = {y^A,Zn-tz =y does not fulfill the condition}. 

Note that \Gu\=nsQ. and |Gi)| =/2^(l — Q) since 
Q = P(condition fulfilled) such that Q can be interpreted as the 
proportion of states of Z that fulfill the condition. Due to 
equiprobability of spins and uniform initial distribution, for any t 
there are only two possible values of 2l^\y)? for yEGjj and one 
for yeGj). Therefore define sgn(y) such that 



to get 



sgn(y)-- 



+ if yeGu 
- if yeGn 

e? ifyeGu 



(11) 



(12) 




Figure 21 . Analytical Transfer Entropy 7^]. versus time lags t of 
the Random Transition model with ns = l (hence ^=2^ 

= 5 in equation (16) where varied but Mz^^ fixed. T^^^ is 

monotonically increasing with respect to fix- T^zx affected by ^x- 
Figure illustrates how the internal dynamics of influences T^^x when 
X is the target variable. Transfer Entropy changes even though external 
influence Q. is constant. 
doi:10.1371/journal.pone.0099462.g021 



Thus g^^^^^y^ = P(condition fulfilled I Z„_T =y) with the sgn{y) as 
in equation (11). 

The relationship between and Q can be defined using the 

formula for total probability P{B)= Y,^P{B\Z = y)P{Z = y). Let 
5= {condition fulfilled} and using the fact that 

P{Zn-x=y)= — , we get that 

Us 

S1 = P(B)= Y,P(B\Z„-,=y)P(Z„.,=y)= ^J2Q^^„(,y (13) 

y S y 

Us — I 

Due to the sole dependence of Z on fi^, f^z — n^ake the 



transition 



uniform 



probability of 
P(Zn = OL\Zn-\= P)= — for any n since we have that 



such that 



P{Zn = a\Zn-l=P)-- 



1-Mz = l- 



1 n. 



1 1 



Us ris 
1 Us-l _\ 
-1 n. 



for any (X,PeA = {\ 

-sgniy) 



,ns}. Consequently, jiz '- 



fig—l 



ifa = l3 
ifa^P 

- also makes 



all values of Q^jL(y) uniform so that equation (13) becomes 



Q = 



1 



) . 

sgn(y) ' 



-risQ] 



>(-) - 

sgn(y) 



-sgniy) • 



(14) 



ris—l 

Therefore on the model when the fiz— ? have that 

ris 

^=Q^sgn(y) 1*^^ ^^^z- And this is why we get Figure (21), 
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0.25 



0.15 




G.05 



Figure 22. Analytical Transfer Entropy Ty^ versus time lags t of 
the Random Transition model with its = 2 (hence ^=2^ ^nd 

tz = 5 in equation (16) where /<x= ^ fixed and ju^ is varied. Only 
at tz = 5, fiz does not effect and values remain constant. For 
^zx^^ at TT^tz, Transfer Entropy is affected by ^z- /^z = 0.25 and 
/^2 = 0.75 coincides. Figure shows how the internal dynamics of Z 
influences rj]. when Z is the source variable. 
doi:1 0.1 371 /journal. pone.0099462.g022 



where T^^t^O only if t = /z since ^=Q^^g„^y^ in equation (16) 
cancels out. 

For any the relationship between Q^^ and Q^^ can be 
derived from equation (13) where 

«^"=EC(>o=EC(.)+EC(r)=iGde?+iGz>ie<-'(i5) 



Note that when ns = 2 (hence ^ = ^^^^ simplifies to 

Transfer Entropy formula on the Random Transition 
model 

Using Q^lLy) equation (12) we have that 



l-/i^Q 



ns-\ 



-sgnjy) 

Q 



if a = jg 
if aT^jg, 



a y 



= y)log 



P{Xn=Ci\Xn-X=P) 



= \{Xn=Xn-x}\Y, 

y 

=«.E 



2 ^xQsgniy) 



l-/i^Q 



Q 



^-^xQ%iy) ^-^xQ%iy) 



-^,(^,-i)E 



7og 



-E 



■yg^(y) 



l-/x^Q 



^sgn{y) 



l-^^Q 



(1 -fee?) tog 



(i-fee*-!Otog 



(16) 



which gives us 



where we used the Bayes theorem i.e 
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(a) Case 1 (b) Case 2 (c) Case 3 

Figure 23. Transfer Entropy T^x versus number of state (number of chosen bins) for Cases 1,2 and 3. fix = f^z= uniformly 

distributed. Analytical values obtained from substituting respective Q values in equation (17). Simulated values are acquired using equation (5) on 
simulated data of varying sample size S (length of time series) where \K= 1000. Error bars are displaying two standard deviation values above and 
two standard deviation below (some bars are very small, it can barely be seen). The aim is primarily to display how choosing ris has to be made 
according to length, S, of available time series. For large S the error bar becomes smaller than the width of the curve. 
doi:10.1371/journal.pone.0099462.g023 



Due to independence, if Y were to be conditioned on X we would 
have that 



P{Yn = a\Yn-l=P.Xn-,=y) P{Yn = a\Yn-l = P) 



P{Yn = a\Yn-l=P) 



P{Yn = (^\Yn-l=P) 



:1. 



Therefore for values other than when X and Y conditioned on Z, 
this ratio will yield 1 . This renders 7"J^ = T^jz ^ ^tk ^ ^XY ^ ^• 
And if we get that T^^x ^ ^^Y Transfer Entropy 

indicates 'causality' or some form of directionality from Z to X 
and Z to F, at time lag t. In a similar manner for Oi,l],yeA we have 
that 



A more thorough treatment of the Random Transition model and 
other methods of Transfer Entropy estimations is given in [32]. 

Understanding 'causality' on the Random Transition 
model 

The unclear meaning of the magnitude of Transfer Entropy is 
one of its main criticism [6,18]. This is partly due to the ERC 
which incorporates both external and internal influences, the 
separation of which is rather unclear. The advantage of 
investigating Transfer Entropy on the Random Transition model 
is that the ERC can be defined in terms of internal and external 
elements i.e. for any variable X we have that 

ERCx = P{X„ ^X„_i)=Y,P(X„ = a\X„.x=P) = n^Q., 



P{Y„ = a\Y„. 



!,Z„_t=y) 



P{Y„ = a.\Y„. 



1 — /iy£) 



ris-l 



Q 



such that T^\r in exactly like equation (16) except that fix is 
replaced with {ly 

When T = tz we have that Q[^gn(y^ is either 0 or 1 since the 
condition was placed Sitn — tz- More specifically we will have that 
Q^^"* = 1 and that Q^!^"^ = 0. Putting these two values in equation 
(16) we obtain 



4f = Q 



-(1-Q) 



,(^z) 



,(^z) Q+ 



1-//^Q 



Q 



(17) 



where fix is the internal transition probability of X and Q 
represents the external influence applied on X. If the condition 
in our model is that Z;^_i = l for X^ and F„ to change 
values then, Q = ^(condition f ulf illed) = P(Zn - 1 = 1 ) so that 
ERCx = liixP(Zn-i = 1) and ERCy = fiyPiZn-i = !)• However, 
for the source Z which has no external influence, Q = 1 and 
consequently ERCz = ^(Z„ ^Zn-\) = fi^. 

When ns = 2, the model essentially replicates the Ising model 
without the collective behaviour effect i.e. far above the Tc where 
the Boltzmann distribution approaches a uniform distribution. 
Consequently, at these temperatures the influence of collective 
behaviour is close to none. One can see in Figure (21) and 
Figure (22) that the fi (hence the ERC) values are indeed key in 
determining the strength of Transfer Entropy. In Figure (21), /J^x 
influences T^x monotonically when every other value is fixed, 
therefore in this case the Transfer Entropy reflects the internal 
dynamics fix rather than the external influence Q. If 'causality' is 
the aim, surely Q is the very thing that makes the relationship 
'causal' and should be the main focus. This is a factor that needs to 
be taken into account when comparing the magnitudes of Transfer 
Entropy. Figure (21) also shows that when fi^ is uniform (since 



ns = 2) hence fi^ - 



1 1 



, one gets that T'^x ^ only if t = 



Hs 2' 

which makes causal lag detection fairly straight forward. However, 
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(a) T^x^ versus ris (b) log T^^^ versus rig 

Figure 24. Transfer Entropy using equation (1 7) on simulated null model with varying sample size or length of time series, S where 
17^ = 1000. Analytical values are all 0. Error bars in the first figure are displaying two standard deviation values above and two standard deviation 
below. For large S the error bar becomes smaller than the width of the curve. In order to use the null model as surrogates, ris still has to be chosen in 
accordance to S. 

doi:10.1371/journal.pone.0099462.g024 



in Figure (22) the effect of varying /i^ can be clearly seen in the 
nonzero values T^]^y^O when T^tz- Nevertheless, the value at 
T = tz seems to be fully determined by fix regardless of fi^ value. 
The mechanism in which fi^ effects T'^x sketched in the 
appendix. 

Therefore one can conclude that when Z is the source ('causal' 
variable) and X is the target (the variable being affected by the 
'causal' link), the value of the Transfer Entropy T'^^ = is 
influenced only by fix but for Ty^tz, T^zx determined by both 
fix and fiz- We have verified that this is indeed the case even when 

> 2 in this model. This should apply to all variables in the model 
and much more generally to any kind of source-target 'causal' 
relationship in this sense. We suspect that this also extends to cases 
when there is more than one source and this will be a subject of 
future research. Thus for causal lag detection purposes, it is clear 
that theoretically Transfer Entropy will attain maximum value at 
the exact causal lag. It is also clear that Transfer Entropy at nearby 
lags can be nonzero due to this single 'causal' relationship. Thus, 
on data sets it is strongly recommended to test for relative lag 
values. 

Transfer Entropy estimations of the Random Transition 
model 

For a classical histogram estimation of Transfer Entropy on real 
data sets [17], one can say that the number of states fis corresponds 
to the number of bins chosen for estimation. The estimations of 
Transfer Entropy for larger ris requires sufficient sample size 
(sufficient length of time series). To illustrate this finite sampling 

effect we set the value Q to three different values; Q = — for Case 

yig — ^ 1 

1, Q= for Case 2 and 0 = - for Case 3. We plot the 

fis 2 ^ 

analytical Transfer Entropy T'^x ^ and its estimations on simulated 

values of varying time series length, S, for all three cases in 

Figure (23). The exact rig is known and incorporated in the 

estimations. 



The observed existence of spurious detection or overestimation 
(finite sampling effects) as in Figure (23), is not uncommon and has 
been reported in relation to causality measures [15,20,33,34]. This 
situation would be even more confusing in situations where lis is 
not known (unfortunately, this is more often than not the case). 
The significant testing (or lack of it) of Transfer Entropy is 
admittedly one of its main criticism. Initially, we have sidestepped 
this issue by implementing Transfer Entropy on relatively small rig 
to easily get statistically significant estimations. In fact of the main 
motivation for the use of the Ising model in the testing of Transfer 
Entropy is to exactly sidestep this issue since no binning is required 
and one can focus on the issue of what exactly does the Transfer 
Entropy measures. However Figure (23) clearly shows that for 
larger ris, some form of validation is required to avoid false 
directionality conclusion. Surrogates have been suggested as a 
form of significant testing for Transfer Entropy [13,20,26,35]. 
Surrogate data sets are synthetically generated data which should 
ideally preserve all properties of the underlying system except the 
one being tested [20]. There are many different types of surrogates 
to serve different purposes [13, 14, 16, 20, 26, 35]. The idea is to 
break the coupling (causal link) but maintain dynamics in hope 
that one can differentiate cause and effect from any other 
dynamics. 

One way to attain surrogates is by generating a null model (in 
the case of the Random Transition model this is simply three 
randomly generated time series) and test the values of Transfer 
Entropy as in Figure (24). Subtracting the null model from the 
values on the Random Transition model is equal to subtracting the 
Transfer Entropy values of both directions as one direction is 
theoretically zero. This is the idea behind the effective and 
corrected Transfer Entropy [15,18]. However this does not quite 
solve the problem as the values may still be negative if the sample 
size is small. There are many other types of corrections [6,13] 
proposed to address this issue involving substraction of the null 
model in some various forms. Nevertheless, as we have seen in 
Figure (19) of the amended Ising model, only by subtracting the 
two directions of Transfer Entropy did we obtain the clear 
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direction as this cancelled out the underlying collective behaviour. 
We suspect that this will work as well for cancelling out other types 
of background effects and succeed in revealing directionality. 

Discussion 

This paper highlights the question of distinguishing interdepen- 
dencies induced by collective behaviour and individual (coupled) 
interactions, in order to understand the inner workings of complex 
systems derived from data sets. These data sets are usually in the 
form of time series that seem to behave essentially as stochastic 
series. It is hence of great interest to understand measures 
proposed to be able to probe 'causality' in view of complex 
systems. Transfer Entropy has been suggested as a good probe on 
the basis of its nonlinearities, exploratory approach and informa- 
tion transfer related interpretation. 

To investigate the behaviour of Transfer Entropy, we studied 
two simplistic models. From results of applying Transfer Entropy 
on the Ising model, we proposed that the collective behaviour is 
also a type of 'causality' in the Wiener-Granger framework but 
highlighted that it should be identified differently from individual 
interactions by illustrating this issue on an amended Ising model. 
The collective behaviour that emerges near criticality may 
overshadow the intrinsic directionality in the system as it is not 
detected by measures such as covariance (correlation) and Mutual 
Information. We showed that by taking into account both 
directions of Transfer Entropy on the amended Ising model, a 
clear direction can be identified. In addition to that, we verified 
that the Transfer Entropy is indeed maximum at the exact causal 
lag by utilizing the amended Ising model. 

By obtaining the phase transition-like difference measure, we have 
shown that the Transfer Entropy is highly dependent on the 
effective rate of change (ERG) and therefore likely to be dependent 
on the overall activity level given by, say, the temperature in 
thermal systems as we demonstrated in the amended Ising model. 
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We believe that identifying these influences is important for our 
understanding of Transfer Entropy with the aim of utilising its full 
potential in uncovering the dynamics of complex systems. The 
mechanism of replicating 'causality' in the amended Ising model 
and the Random Transition model may be used to investigate 
these 'causality' measures even further. Plans for future investiga- 
tions involve indirect 'causality', multiple sources and multiple 
targets. It would also be interesting to understand these measures 
in terms of local and global dynamics in dynamical systems. It is 
our hope that these investigations will help establish these 
'causality' measures as a repertoire of measures for complex 
systems. 
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